
 influenza virus


Flu Is Relentless. Crispr Might Be Able to Shut It Down

WIRED

Innovative research into the gene-editing tool targets influenza's ability to replicate, stopping it in its tracks. As he addressed an audience of virologists from China, Australia, and Singapore at October's Pandemic Research Alliance Symposium, Wei Zhao introduced an eye-catching idea. The gene-editing technology Crispr is best known for delivering groundbreaking new therapies for rare diseases, tweaking or knocking out rogue genes in conditions ranging from sickle cell disease to hemophilia. But Zhao and his colleagues at Melbourne's Peter Doherty Institute for Infection and Immunity have envisioned a new application. They believe Crispr could be tailored to create a next-generation treatment for influenza, whether that's the seasonal strains that plague both the northern and southern hemispheres every year, or the worrisome new variants in birds and other wildlife that might trigger the next pandemic.


Leveraging Large Language Models to Predict Antibody Biological Activity Against Influenza A Hemagglutinin

Barkan, Ella, Siddiqui, Ibrahim, Cheng, Kevin J., Golts, Alex, Shoshan, Yoel, Weber, Jeffrey K., Mota, Yailin Campos, Ozery-Flato, Michal, Sautto, Giuseppe A.

arXiv.org Artificial Intelligence

Monoclonal antibodies (mAbs) represent one of the most prevalent FDA-approved modalities for treating autoimmune diseases, infectious diseases, and cancers. However, discovery and development of therapeutic antibodies remains a time-consuming and expensive process. Recent advancements in machine learning (ML) and artificial intelligence (AI) have shown significant promise in revolutionizing antibody discovery and optimization. In particular, models that predict antibody biological activity enable in-silico evaluation of binding and functional properties; such models can prioritize antibodies with the highest likelihoods of success in costly and time-intensive laboratory testing procedures. We here explore an AI model for predicting the binding and receptor blocking activity of antibodies against influenza A hemagglutinin (HA) antigens. Our present model is developed with the MAMMAL framework for biologics discovery to predict antibody-antigen interactions using only sequence information. To evaluate the model's performance, we tested it under various data split conditions to mimic real-world scenarios. Our models achieved an AUROC $\geq$ 0.91 for predicting the activity of existing antibodies against seen HAs and an AUROC of 0.9 for unseen HAs. For novel antibody activity prediction, the AUROC was 0.73, which further declined to 0.63-0.66 under stringent constraints on similarity to existing antibodies. These results demonstrate the potential of AI foundation models to transform antibody design by reducing dependence on extensive laboratory testing and enabling more efficient prioritization of antibody candidates. Moreover, our findings emphasize the critical importance of diverse and comprehensive antibody datasets to improve the generalization of prediction models, particularly for novel antibody development.
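The evaluation described above, AUROC measured separately on seen and unseen antigens, can be illustrated with a small sketch. This is not the authors' code or the MAMMAL API: the record layout (dictionaries with an `antigen` key), the function names, and the toy scores are all assumptions made for illustration. AUROC is computed here from its rank definition, the probability that a randomly chosen active pair scores higher than a randomly chosen inactive one.

```python
def auroc(labels, scores):
    """AUROC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def split_by_antigen(records, held_out_antigens):
    """Hypothetical 'unseen HA' split: hold out every record for the listed antigens."""
    train = [r for r in records if r["antigen"] not in held_out_antigens]
    test = [r for r in records if r["antigen"] in held_out_antigens]
    return train, test


# Toy records (made up): antibody/antigen pairs with a binary activity label.
records = [
    {"antibody": "ab1", "antigen": "HA_a", "label": 1},
    {"antibody": "ab2", "antigen": "HA_a", "label": 0},
    {"antibody": "ab1", "antigen": "HA_b", "label": 1},
    {"antibody": "ab3", "antigen": "HA_b", "label": 0},
]
train, test = split_by_antigen(records, {"HA_b"})

# Toy scores for four examples, two active (1) and two inactive (0).
example_auroc = auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Splitting by antigen rather than at random keeps every record for a held-out HA out of training, which is one plausible way to mimic the "unseen HA" condition; a stricter split for novel antibodies would hold out by antibody or by sequence similarity instead.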


MC-NN: An End-to-End Multi-Channel Neural Network Approach for Predicting Influenza A Virus Hosts and Antigenic Types

Xu, Yanhua, Wojtczak, Dominik

arXiv.org Artificial Intelligence

Influenza poses a significant threat to public health, particularly among the elderly, young children, and people with underlying diseases. The manifestation of severe conditions, such as pneumonia, highlights the importance of preventing the spread of influenza. An accurate and cost-effective prediction of the host and antigenic subtypes of influenza A viruses is essential to addressing this issue, particularly in resource-constrained regions. In this study, we propose a multi-channel neural network model to predict the host and antigenic subtypes of influenza A viruses from hemagglutinin and neuraminidase protein sequences. Our model was trained on a comprehensive data set of complete protein sequences and evaluated on various test data sets of complete and incomplete sequences. The results demonstrate the potential and practicality of using multi-channel neural networks in predicting the host and antigenic subtypes of influenza A viruses from both full and partial protein sequences.


Predicting Influenza A Viral Host Using PSSM and Word Embeddings

Xu, Yanhua, Wojtczak, Dominik

arXiv.org Artificial Intelligence

The rapid mutation of the influenza virus threatens public health. Reassortment among viruses with different hosts can lead to a fatal pandemic. However, it is difficult to detect the original host of the virus during or after an outbreak, as influenza viruses can circulate between different species. Early and rapid detection of the viral host would therefore help reduce the further spread of the virus. We use various machine learning models with features derived from the position-specific scoring matrix (PSSM) and features learned from word embedding and word encoding to infer the origin host of viruses. The results show that the PSSM-based model reaches an MCC of around 95% and an F1 of around 96%. The model with word embedding obtains an MCC of around 96% and an F1 of around 97%.
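For readers unfamiliar with the two metrics quoted above, here is a minimal sketch of how MCC (Matthews correlation coefficient) and F1 are computed. It assumes a binary labelling (for example, one host versus all others), which simplifies the multi-host setting of the abstract; the function names and toy labels are illustrative only.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; 0.0 by convention if a marginal is empty."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def f1(y_true, y_pred):
    """F1 score: harmonic mean of precision and recall."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy labels (made up): tp=2, fp=1, fn=1, tn=2.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
toy_mcc = mcc(y_true, y_pred)
toy_f1 = f1(y_true, y_pred)
```

MCC uses all four cells of the confusion matrix, so it is less flattering than F1 when classes are imbalanced, which may be why the abstract reports both.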


'Time-traveling' pathogens trapped for thousands of years in melting permafrost could spark next pandemic and wipe out microbes crucial to our planet

Daily Mail - Science & tech

Scientists fear 'time-traveling' pathogens could be leaked into the world as their icy prison in permafrost melts, and that they could spark the next pandemic and destroy the environment. Ancient viruses, sealed in permafrost for thousands of years, could survive and evolve to become the dominant free-living species, killing up to one-third of bacteria-like hosts. The stark revelation was made by researchers at the European Commission Joint Research Center, who used computer simulations to find that about three percent of virus-like pathogens became dominant after being released from the ice. The new findings suggest that the risks posed by time-traveling pathogens, so far confined to science fiction stories, could be powerful drivers of ecological change and threats to human health.


Can large language models democratize access to dual-use biotechnology?

Soice, Emily H., Rocha, Rafael, Cordova, Kimberlee, Specter, Michael, Esvelt, Kevin M.

arXiv.org Artificial Intelligence

Large language models (LLMs) such as those embedded in 'chatbots' are accelerating and democratizing research by providing comprehensible information and expertise from many different fields. However, these models may also confer easy access to dual-use technologies capable of inflicting great harm. To evaluate this risk, the 'Safeguarding the Future' course at MIT tasked non-scientist students with investigating whether LLM chatbots could be prompted to assist non-experts in causing a pandemic. In one hour, the chatbots suggested four potential pandemic pathogens, explained how they can be generated from synthetic DNA using reverse genetics, supplied the names of DNA synthesis companies unlikely to screen orders, identified detailed protocols and how to troubleshoot them, and recommended that anyone lacking the skills to perform reverse genetics engage a core facility or contract research organization. Collectively, these results suggest that LLMs will make pandemic-class agents widely accessible as soon as they are credibly identified, even to people with little or no laboratory training. Promising nonproliferation measures include pre-release evaluations of LLMs by third parties, curating training datasets to remove harmful concepts, and verifiably screening all DNA generated by synthesis providers or used by contract research organizations and robotic cloud laboratories to engineer organisms or viruses.


InForecaster: Forecasting Influenza Hemagglutinin Mutations Through the Lens of Anomaly Detection

Garjani, Ali, Chegini, Atoosa Malemir, Salehi, Mohammadreza, Tabibzadeh, Alireza, Yousefi, Parastoo, Razizadeh, Mohammad Hossein, Esghaei, Moein, Esghaei, Maryam, Rohban, Mohammad Hossein

arXiv.org Artificial Intelligence

The influenza virus hemagglutinin plays an important part in the attachment of the virus to host cells. The hemagglutinin proteins are among the genetic regions of the virus with a high potential for mutations. Because predicting mutations is important for producing effective and low-cost vaccines, solutions that approach this problem have recently gained significant attention. In such solutions, a historical record of mutations has been used to train predictive models. However, the imbalance between mutations and the preserved proteins is a major challenge for the development of such models and needs to be addressed. Here, we propose to tackle this challenge through anomaly detection (AD). AD is a well-established field in machine learning (ML) that tries to distinguish unseen anomalies from normal patterns using only normal training samples. By treating mutations as the anomalous behavior, we can benefit from the rich solutions that have recently emerged in this field. Such methods also fit the problem setup of extreme imbalance between the number of unmutated and mutated training samples. Motivated by this formulation, our method tries to find a compact representation for unmutated samples while forcing anomalies to be separated from the normal ones. This helps the model learn, as far as possible, a shared representation of the normal training samples, which makes mutated samples easier to distinguish from unmutated ones at test time. We conduct a large number of experiments on four publicly available datasets, consisting of three different hemagglutinin protein datasets and one SARS-CoV-2 dataset, and show the effectiveness of our method through different standard criteria.
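The one-class idea in this abstract, fitting only on normal (unmutated) samples and flagging whatever falls far from them, can be sketched with a simple centroid baseline. This is not the InForecaster method, which learns its representation; here the feature vectors are assumed to be given, and the function names, the 0.95 quantile, and the toy data are all illustrative assumptions.

```python
import math


def fit_center(normal_vectors):
    """Mean of the normal (unmutated) feature vectors."""
    n = len(normal_vectors)
    dim = len(normal_vectors[0])
    return [sum(v[i] for v in normal_vectors) / n for i in range(dim)]


def anomaly_score(vector, center):
    """Euclidean distance to the center; larger means more anomalous."""
    return math.sqrt(sum((x - c) ** 2 for x, c in zip(vector, center)))


def fit_threshold(normal_vectors, center, quantile=0.95):
    """Distance below which roughly `quantile` of the normal samples fall (assumed cutoff)."""
    scores = sorted(anomaly_score(v, center) for v in normal_vectors)
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[index]


def is_anomalous(vector, center, threshold):
    """Flag a sample as a candidate mutation if it lies beyond the threshold."""
    return anomaly_score(vector, center) > threshold


# Toy 2-D features (made up): normal samples cluster around (0.5, 0.5).
normal = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
center = fit_center(normal)
threshold = fit_threshold(normal, center)
flag_far = is_anomalous([5.0, 5.0], center, threshold)
flag_near = is_anomalous([0.5, 0.5], center, threshold)
```

Because no mutated samples are needed for fitting, a baseline like this sidesteps the class imbalance the abstract describes; the compactness of the normal cluster then determines how well mutated samples separate.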


Multi-channel neural networks for predicting influenza A virus hosts and antigenic types

Xu, Yanhua, Wojtczak, Dominik

arXiv.org Artificial Intelligence

Influenza occurs every season and occasionally causes pandemics. Despite its low mortality rate, influenza is a major public health concern, as it can be complicated by severe diseases like pneumonia. A fast, accurate and low-cost method to predict the origin host and subtype of influenza viruses could help reduce virus transmission and benefit resource-poor areas. In this work, we propose multi-channel neural networks to predict antigenic types and hosts of influenza A viruses with hemagglutinin and neuraminidase protein sequences. An integrated data set containing complete protein sequences was used to produce a pre-trained model, and two other data sets were used for testing the model's performance. One test set contained complete protein sequences, and the other contained incomplete protein sequences. The results suggest that multi-channel neural networks are applicable and promising for predicting influenza A virus hosts and antigenic subtypes with complete and partial protein sequences.


Graph Convolutional Neural Networks to Analyze Complex Carbohydrates

#artificialintelligence

Graph convolutional neural networks (GCNs) have attracted increasing amounts of attention over the last couple of years, with more and more disciplines finding use for them. This has also been extended into the life sciences, as GCNs have been used to analyze proteins, drugs, and of course biological networks. One key advantage of GCNs that has enabled this expansion is their ability to natively work with nonlinear data formats, in contrast to more linear data structures such as in natural languages. Because of this feature, we also implemented GCNs for our own topic of interest, the study of complex carbohydrates or glycans. Glycans are ubiquitous in biology, decorating every cell and playing key roles in processes such as viral infection or tumor immune evasion.


Algorithmic Bio-surveillance For Precise Spatio-temporal Prediction of Zoonotic Emergence

Dhanoa, Jaideep, Manicassamy, Balaji, Chattopadhyay, Ishanu

arXiv.org Machine Learning

Viral zoonoses have emerged as the key drivers of recent pandemics. Human infections by zoonotic viruses are either spillover events -- isolated infections that fail to cause a widespread contagion -- or species jumps, where successful adaptation to the new host leads to a pandemic. Despite expensive bio-surveillance efforts, emergence response has historically been reactive and post hoc. Here we use machine inference to demonstrate a high-accuracy predictive bio-surveillance capability, designed to proactively localize an impending species jump via automated interrogation of massive sequence databases of viral proteins. Our results suggest that a jump might not purely be the result of an isolated unfortunate cross-infection localized in space and time; there are subtle yet detectable patterns of genotypic changes accumulating in the global viral population leading up to emergence. Using tens of thousands of protein sequences simultaneously, we train models that track maximum achievable accuracy for disambiguating host tropism from the primary structure of surface proteins, and show that the inverse classification accuracy is a quantitative indicator of jump risk. We validate our claim in the context of the 2009 swine flu outbreak and the 2004 emergence of H5N1 subspecies of Influenza A from avian reservoirs, illustrating that interrogation of the global viral population can unambiguously track a near-monotonic risk elevation over several preceding years leading to eventual emergence.